 single image


Self-Supervised Intrinsic Image Decomposition

Neural Information Processing Systems

Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data. In contrast to traditional fully supervised learning approaches, in this paper we propose learning intrinsic image decomposition by explaining the input image. Our model, the Rendered Intrinsics Network (RIN), combines an image decomposition pipeline, which predicts reflectance, shape, and lighting conditions given a single image, with a recombination function, a learned shading model used to recompose the original input from the intrinsic image predictions. Our network can then use unsupervised reconstruction error as an additional signal to improve its intermediate representations. This allows large-scale unlabeled data to be useful during training, and also enables transferring learned knowledge to images of unseen object categories, lighting conditions, and shapes. Extensive experiments demonstrate that our method performs well on both intrinsic image decomposition and knowledge transfer.
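
The recombination idea in the abstract above can be illustrated with a minimal sketch. This is not the RIN implementation: the abstract describes a learned shading model, whereas the sketch below assumes a fixed Lambertian shader as a stand-in, and all function names (`lambertian_shading`, `recombine`, `reconstruction_loss`) are hypothetical. It only shows how predicted reflectance, shape (surface normals), and lighting could be recomposed into an image and compared against the input without any ground-truth labels.

```python
import numpy as np

def lambertian_shading(normals, light):
    """Assumed stand-in for a learned shader: shading = max(n . l, 0)."""
    light = light / np.linalg.norm(light)          # unit light direction
    return np.clip(normals @ light, 0.0, None)     # shape (H, W)

def recombine(reflectance, normals, light):
    """Recompose an image as reflectance times shading."""
    shading = lambertian_shading(normals, light)
    return reflectance * shading[..., None]        # shape (H, W, 3)

def reconstruction_loss(image, reflectance, normals, light):
    """Mean squared error between the input and its recomposition."""
    recomposed = recombine(reflectance, normals, light)
    return float(np.mean((recomposed - image) ** 2))

# Toy data: a flat surface facing the camera, lit head-on.
rng = np.random.default_rng(0)
H, W = 4, 4
normals = np.zeros((H, W, 3))
normals[..., 2] = 1.0
reflectance = rng.uniform(0.2, 0.9, size=(H, W, 3))
light = np.array([0.0, 0.0, 1.0])
image = recombine(reflectance, normals, light)

# Correct intrinsics explain the image; a wrong reflectance does not.
loss_true = reconstruction_loss(image, reflectance, normals, light)
loss_wrong = reconstruction_loss(image, 0.5 * reflectance, normals, light)
```

In this toy setup the loss is zero for the intrinsics that generated the image and positive for a perturbed reflectance, which is the kind of label-free signal the abstract refers to as reconstruction error.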






SceneScape: Text-Driven Consistent Scene Generation

Neural Information Processing Systems

We present a method for text-driven perpetual view generation: synthesizing long-term videos of various scenes solely from an input text prompt describing the scene and camera poses.




Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly

Neural Information Processing Systems

Large language and vision models have been leading a revolution in visual computing. By greatly scaling up the sizes of data and model parameters, these large models learn deep priors that lead to remarkable performance in various tasks.


One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization

Neural Information Processing Systems

The problem is challenging as it requires not only the reconstruction of visible parts but also the hallucination of invisible regions. Consequently, this problem is often ill-posed and corresponds to multiple plausible solutions because of insufficient evidence from a single image.